An Analysis of Publication Venues for Automatic Differentiation Research
We present the results of our analysis of publication venues for papers on
automatic differentiation (AD), covering academic journals and conference
proceedings. Our data are collected from the AD publications database
maintained by the autodiff.org community website. The database is purpose-built
for the AD field and is expanding via submissions by AD researchers. Therefore,
it provides a relatively noise-free list of publications relating to the field.
However, it does include noise in the form of variant spellings of journal and
conference names. We handle this by manually correcting and merging these
variants under the official names of the corresponding venues. We also share
the raw data obtained after these corrections.
Comment: 6 pages, 3 figures
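As a concrete illustration of the venue-name correction step described above, the short Python sketch below merges variant spellings under a canonical name via a lookup table. The table entries and function name are illustrative stand-ins, not the paper's actual correction list.

# A minimal sketch of merging variant venue spellings under official
# names; the CANONICAL table is a hypothetical stand-in.
CANONICAL = {
    "siam j. sci. comput.": "SIAM Journal on Scientific Computing",
    "siam journal of scientific computing": "SIAM Journal on Scientific Computing",
    "optim. method. softw.": "Optimization Methods and Software",
}

def canonical_venue(name):
    """Map a variant spelling to its official venue name, if known."""
    return CANONICAL.get(name.strip().lower(), name.strip())

print(canonical_venue("SIAM J. Sci. Comput."))  # -> official journal name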
Automatic Differentiation of Algorithms for Machine Learning
Automatic differentiation---the mechanical transformation of numeric computer
programs to calculate derivatives efficiently and accurately---dates to the
origin of the computer age. Reverse mode automatic differentiation both
antedates and generalizes the method of backwards propagation of errors used in
machine learning. Despite this, practitioners in a variety of fields, including
machine learning, have been little influenced by automatic differentiation, and
make scant use of available tools. Here we review the technique of automatic
differentiation, describe its two main modes, and explain how it can benefit
machine learning practitioners. To reach the widest possible audience our
treatment assumes only elementary differential calculus, and does not assume
any knowledge of linear algebra.
Comment: 7 pages, 1 figure
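To make one of the two modes concrete, here is a minimal, self-contained sketch of forward-mode AD in Python using dual numbers; it illustrates the general technique and is not code from the paper.

# A minimal sketch of forward-mode AD with dual numbers: each value
# carries its derivative, and arithmetic propagates both together.
import math

class Dual:
    def __init__(self, val, dot=0.0):
        self.val, self.dot = val, dot   # value and derivative

    def __add__(self, other):
        # sum rule: (u + v)' = u' + v'
        return Dual(self.val + other.val, self.dot + other.dot)

    def __mul__(self, other):
        # product rule: (uv)' = u'v + uv'
        return Dual(self.val * other.val,
                    self.dot * other.val + self.val * other.dot)

def sin(x):
    # chain rule: (sin u)' = cos(u) * u'
    return Dual(math.sin(x.val), math.cos(x.val) * x.dot)

x = Dual(1.5, 1.0)      # seed dx/dx = 1
y = x * x + sin(x)      # y = x^2 + sin(x)
print(y.val, y.dot)     # y and dy/dx = 2x + cos(x), both evaluated at 1.5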
Automated Generation of Cross-Domain Analogies via Evolutionary Computation
Analogy plays an important role in creativity, and is extensively used in
science as well as art. In this paper we introduce a technique for the
automated generation of cross-domain analogies based on a novel evolutionary
algorithm (EA). Unlike existing work in computational analogy-making, which is
restricted to creating analogies between two given cases, our approach can, for
a given case, create an analogy along with the novel analogous case itself.
Our algorithm is based on the concept of "memes", which are units of culture,
or knowledge, undergoing variation and selection under a fitness measure, and
represents evolving pieces of knowledge as semantic networks. Using a fitness
function based on Gentner's structure mapping theory of analogies, we
demonstrate the feasibility of spontaneously generating semantic networks that
are analogous to a given base network.
Comment: Conference submission, International Conference on Computational Creativity 2012 (8 pages, 6 figures)
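As a toy illustration of a structure-mapping-style fitness, the Python sketch below scores a candidate semantic network by how much of the base network's relational skeleton it can reproduce under some mapping of concepts. This is a simplified reading of the idea, not the paper's actual measure.

# A toy fitness in the spirit of Gentner's structure mapping: count how
# many base relations survive under the best concept-to-concept mapping,
# ignoring concept labels themselves. Brute force, so only suitable for
# tiny networks; purely illustrative.
from itertools import permutations

def relational_fitness(base, candidate):
    """base, candidate: sets of (relation, head, tail) triples."""
    base_concepts = sorted({c for _, h, t in base for c in (h, t)})
    cand_concepts = {c for _, h, t in candidate for c in (h, t)}
    best = 0
    for perm in permutations(cand_concepts, len(base_concepts)):
        mapping = dict(zip(base_concepts, perm))
        score = sum((r, mapping[h], mapping[t]) in candidate
                    for r, h, t in base)
        best = max(best, score)
    return best

water = {("flows", "water", "pipe"), ("contains", "pipe", "water")}
electricity = {("flows", "current", "wire"), ("contains", "wire", "current")}
print(relational_fitness(water, electricity))   # 2: a perfect analogy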
Evolution of Ideas: A Novel Memetic Algorithm Based on Semantic Networks
This paper presents a new type of evolutionary algorithm (EA) based on the
concept of "meme", where the individuals forming the population are represented
by semantic networks and the fitness measure is defined as a function of the
represented knowledge. Our work can be classified as a novel memetic algorithm
(MA), given that (1) it is units of culture, or information, that undergo
variation, transmission, and selection, close to the original sense of memetics
as introduced by Dawkins; and (2) it differs from existing MAs, where memetics
has been utilized as a means of local refinement by individual learning after
classical global sampling by an EA.
The individual pieces of information are represented as simple semantic
networks that are directed graphs of concepts and binary relations, going
through variation by memetic versions of operators such as crossover and
mutation, which utilize knowledge from commonsense knowledge bases. In
evaluating this introductory work, we adopt an interesting fitness measure: the
structure mapping theory of analogical reasoning from psychology is used to
evolve pieces of information that are analogous to a given piece of base
information.
Considering other possible fitness measures, the proposed representation and
algorithm can serve as a computational tool for modeling memetic theories of
knowledge, such as evolutionary epistemology and cultural selection theory.
Comment: Conference submission, 2012 IEEE Congress on Evolutionary Computation (8 pages, 7 figures)
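A minimal sketch of what such memetic variation might look like on triple-based semantic networks follows; the COMMONSENSE table stands in for a real commonsense knowledge base (e.g. ConceptNet), and the operator details are illustrative guesses rather than the paper's implementation.

# A minimal sketch of variation operators on semantic networks
# represented as sets of (head, relation, tail) triples. COMMONSENSE is
# a stand-in for a real commonsense knowledge base; all names here are
# illustrative.
import random

COMMONSENSE = {
    "bird": [("capable_of", "fly"), ("has_part", "wing")],
    "fish": [("capable_of", "swim"), ("at_location", "water")],
}

def mutate(network):
    """Replace one triple with a related fact about one of its concepts."""
    net = set(network)
    head, rel, tail = random.choice(sorted(net))
    facts = COMMONSENSE.get(head) or COMMONSENSE.get(tail)
    if facts:
        new_rel, new_tail = random.choice(facts)
        net.discard((head, rel, tail))
        net.add((head, new_rel, new_tail))
    return net

def crossover(a, b):
    """Exchange random subsets of triples between two parent networks."""
    a, b = sorted(a), sorted(b)
    cut_a, cut_b = random.randint(0, len(a)), random.randint(0, len(b))
    return set(a[:cut_a]) | set(b[cut_b:])

parent1 = {("bird", "capable_of", "fly"), ("bird", "has_part", "wing")}
parent2 = {("fish", "at_location", "water"), ("fish", "capable_of", "swim")}
child = crossover(parent1, parent2)
print(mutate(child) if child else set())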
Using Synthetic Data to Train Neural Networks is Model-Based Reasoning
We draw a formal connection between using synthetic training data to optimize
neural network parameters and approximate, Bayesian, model-based reasoning. In
particular, training a neural network using synthetic data can be viewed as
learning a proposal distribution generator for approximate inference in the
synthetic-data generative model. We demonstrate this connection in a
recognition task where we develop a novel Captcha-breaking architecture and
train it using synthetic data, demonstrating both state-of-the-art performance
and a way of computing task-specific posterior uncertainty. Using a neural
network trained this way, we also demonstrate successful breaking of real-world
Captchas currently used by Facebook and Wikipedia. Reasoning from these
empirical results and drawing connections with Bayesian modeling, we discuss
the robustness of synthetic data results and suggest important considerations
for ensuring good neural network generalization when training with synthetic
data.
Comment: 8 pages, 4 figures
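The toy Python sketch below captures the shape of this idea: pairs are sampled from a simple generative model (a noisy one-hot "renderer" standing in for a Captcha generator), and a softmax classifier is fit to recover the latent from the observation, i.e. to approximate the model's posterior. All details are illustrative stand-ins for the paper's renderer and architecture.

# A toy version of the idea: sample (latent, observation) pairs from a
# generative model, then fit a network to predict the latent from the
# observation, learning an approximation to the model's posterior.
import numpy as np

rng = np.random.default_rng(0)
K, D, N = 4, 16, 5000            # classes, observation dim, sample count
templates = rng.normal(size=(K, D))

def generative_model(n):
    z = rng.integers(0, K, size=n)                     # latent "text"
    x = templates[z] + 0.5 * rng.normal(size=(n, D))   # noisy rendering
    return z, x

z, x = generative_model(N)
W = np.zeros((D, K))
for _ in range(200):                          # softmax regression by GD
    logits = x @ W
    p = np.exp(logits - logits.max(1, keepdims=True))
    p /= p.sum(1, keepdims=True)
    grad = x.T @ (p - np.eye(K)[z]) / N       # cross-entropy gradient
    W -= 0.5 * grad

z_test, x_test = generative_model(1000)
pred = (x_test @ W).argmax(1)
print("accuracy on fresh synthetic data:", (pred == z_test).mean())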
Towards 3D Retrieval of Exoplanet Atmospheres: Assessing Thermochemical Equilibrium Estimation Methods
Characterizing exoplanetary atmospheres via Bayesian retrievals requires
assuming some chemistry model, such as thermochemical equilibrium or
parameterized abundances. The higher-resolution data offered by upcoming
telescopes enable more complex chemistry models within retrieval frameworks.
Yet, many chemistry codes that model more complex processes like photochemistry
and vertical transport are computationally expensive, and directly
incorporating them into a 1D retrieval model can result in prohibitively long
execution times. Additionally, phase-curve observations with upcoming
telescopes motivate 2D and 3D retrieval models, further exacerbating the
lengthy runtime for retrieval frameworks with complex chemistry models. Here,
we compare thermochemical equilibrium approximation methods based on their
speed and accuracy with respect to a Gibbs energy-minimization code. We find
that, while all methods offer orders of magnitude reductions in computational
cost, neural network surrogate models perform more accurately than the other
approaches considered, achieving a median absolute dex error of <0.03 for the
phase space considered. While our results are based on a 1D chemistry model,
our study suggests that higher dimensional chemistry models could be
incorporated into retrieval models via this surrogate modeling approach.
Comment: 22 pages, 14 figures, submitted to PSJ 2022/11/22, revised 2023/3/7, accepted 2023/3/23. Updated to add Zenodo link to Reproducible Research Compendium
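The sketch below shows the surrogate-modeling pattern in Python: an expensive equilibrium solver is sampled offline over the phase space, a small neural network is fit to those samples, and the cheap surrogate is what a retrieval would then call. slow_equilibrium_solver is a hypothetical stand-in for a Gibbs energy-minimization code, and all ranges and architecture choices are illustrative.

# A minimal sketch of the surrogate-modeling idea: fit a small neural
# network to reproduce an expensive chemistry solver, then query the
# cheap surrogate inside a retrieval loop.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(1)

def slow_equilibrium_solver(T, logP, metallicity):
    """Stand-in for a Gibbs code: log10 mixing ratios of a few species."""
    return np.stack([-3 + 0.001 * T + 0.1 * logP,
                     -4 + metallicity - 0.0005 * T,
                     -7 + 0.2 * logP], axis=-1)

# Sample the (T, log P, [M/H]) phase space and label it with the slow code.
X = np.column_stack([rng.uniform(500, 3000, 4000),   # temperature [K]
                     rng.uniform(-6, 2, 4000),       # log10 pressure [bar]
                     rng.uniform(-1, 1, 4000)])      # metallicity [dex]
Y = slow_equilibrium_solver(*X.T)

surrogate = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000,
                         random_state=0).fit(X, Y)

# Accuracy in dex, echoing the paper's error metric.
X_test = np.column_stack([rng.uniform(500, 3000, 500),
                          rng.uniform(-6, 2, 500),
                          rng.uniform(-1, 1, 500)])
err = np.abs(surrogate.predict(X_test) - slow_equilibrium_solver(*X_test.T))
print("median absolute dex error:", np.median(err))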
Automatic differentiation in machine learning: a survey
Derivatives, mostly in the form of gradients and Hessians, are ubiquitous in
machine learning. Automatic differentiation (AD), also called algorithmic
differentiation or simply "autodiff", is a family of techniques similar to but
more general than backpropagation for efficiently and accurately evaluating
derivatives of numeric functions expressed as computer programs. AD is a small
but established field with applications in areas including computational fluid
dynamics, atmospheric sciences, and engineering design optimization. Until very
recently, the fields of machine learning and AD have largely been unaware of
each other and, in some cases, have independently discovered each other's
results. Despite its relevance, general-purpose AD has been missing from the
machine learning toolbox, a situation slowly changing with its ongoing adoption
under the names "dynamic computational graphs" and "differentiable
programming". We survey the intersection of AD and machine learning, cover
applications where AD has direct relevance, and address the main implementation
techniques. By precisely defining the main differentiation techniques and their
interrelationships, we aim to bring clarity to the usage of the terms
"autodiff", "automatic differentiation", and "symbolic differentiation" as
these are encountered more and more in machine learning settings.
Comment: 43 pages, 5 figures
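As a concrete complement to the survey's definitions, here is a minimal tape-based reverse-mode sketch in Python; it illustrates the general mechanism that backpropagation specializes, and is not tied to any particular AD system.

# A minimal sketch of reverse-mode AD with an explicit tape. Each
# operation records its output and the local partials of its inputs;
# the backward pass replays the tape in reverse, applying the chain rule.
import math

tape = []  # entries: (output, [(input, local_partial), ...])

class Var:
    def __init__(self, val):
        self.val, self.grad = val, 0.0

    def __add__(self, other):
        out = Var(self.val + other.val)
        tape.append((out, [(self, 1.0), (other, 1.0)]))
        return out

    def __mul__(self, other):
        out = Var(self.val * other.val)
        tape.append((out, [(self, other.val), (other, self.val)]))
        return out

def sin(x):
    out = Var(math.sin(x.val))
    tape.append((out, [(x, math.cos(x.val))]))
    return out

def backward(output):
    output.grad = 1.0
    for out, parents in reversed(tape):        # replay the tape backwards
        for parent, partial in parents:
            parent.grad += partial * out.grad  # chain-rule accumulation

x, y = Var(1.5), Var(0.5)
z = sin(x * y) + x * x        # z = sin(xy) + x^2
backward(z)
print(z.val, x.grad, y.grad)  # dz/dx = y cos(xy) + 2x, dz/dy = x cos(xy)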
Tricks from Deep Learning
The deep learning community has devised a diverse set of methods to make practical the gradient-based optimization, over large datasets, of large and highly complex models with deeply cascaded nonlinearities. Taken as a whole, these methods constitute a breakthrough, allowing computational structures which are quite wide, very deep, and with an enormous number and variety of free parameters to be effectively optimized. The result now dominates much of practical machine learning, with applications in machine translation, computer vision, and speech recognition. Many of these methods, viewed through the lens of algorithmic differentiation (AD), can be seen as either addressing issues with the gradient itself, or finding ways of achieving increased efficiency using tricks that are AD-related, but not provided by current AD systems.
The goal of this paper is to explain not just those methods of most relevance to AD, but also the technical constraints and mindset which led to their discovery. After explaining this context, we present a "laundry list" of methods developed by the deep learning community. Two of these are discussed in further mathematical detail: a way to dramatically reduce the size of the tape when performing reverse-mode AD on a (theoretically) time-reversible process like an ODE integrator; and a new mathematical insight that allows for the implementation of a stochastic Newton's method.
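For the first of those two methods, the sketch below illustrates the underlying trick on a toy problem: with a time-reversible integrator (here, leapfrog for a harmonic oscillator), the reverse sweep can reconstruct intermediate states by stepping the dynamics backwards instead of storing them on a tape. The adjoint is written by hand for this linear example, where it happens not to need the reconstructed states; for a nonlinear force they would feed the local partials. This is an illustration of the idea, not the paper's derivation.

# Tape reduction for reverse-mode AD through a reversible integrator:
# keep only the final state on the forward pass and rebuild earlier
# states by running the (exactly invertible) dynamics backwards.

def leapfrog_step(x, v, h):
    v = v - 0.5 * h * x      # half kick (force = -x)
    x = x + h * v            # drift
    v = v - 0.5 * h * x      # half kick
    return x, v

def leapfrog_step_back(x, v, h):
    return leapfrog_step(x, v, -h)   # reversibility: negate the step size

def grad_final_x(x0, v0, h, n):
    """d(final x)/d(x0, v0) without storing the trajectory."""
    x, v = x0, v0
    for _ in range(n):                 # forward pass: keep only the end state
        x, v = leapfrog_step(x, v, h)
    gx, gv = 1.0, 0.0                  # seed: gradient of the final x
    for _ in range(n):                 # reverse sweep
        # Reconstruct the previous state instead of reading it off a tape
        # (unused for this linear force, but a nonlinear adjoint needs it).
        x, v = leapfrog_step_back(x, v, h)
        # Hand-written adjoint of the three updates in leapfrog_step,
        # applied in reverse order.
        gx2, gv2 = gx, gv
        gx2 += -0.5 * h * gv2          # adjoint of the second half kick
        gv2 += h * gx2                 # adjoint of the drift
        gx2 += -0.5 * h * gv2          # adjoint of the first half kick
        gx, gv = gx2, gv2
    return gx, gv

print(grad_final_x(1.0, 0.0, 0.1, 100))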